
Poster: aibek Date: Dec 18, 2012 9:06am
Forum: texts Subject: Re: zipped jpg directory won't derive

The cause of the latest failure, at least, is clear. The log file you linked to shows that the earlier scandata.xml file was being used together with your new _images.zip file, and the mismatch caused the derive to fail. You can solve this by deleting all the derived files before re-deriving (as I note below).

Why the upload with _jpg.zip is failing is more involved. Could it be that something is wrong with your _0000.jpg file? When you upload the files under the name _images.zip, the image files are first converted into jp2 format, and OCR is run on the jp2 files. When the files are uploaded under the name _jpg.zip, OCR is run on the jpg files directly. I can imagine one point where the discrepancy could creep in: your _0000.jpg file has some problem, so OCR on it fails, but the jpg-to-jp2 converter is “forgiving”, so it accepts the bad jpg file and outputs a good jp2, on which the OCR processor runs without complaining. Can you upload your _0000.jpg and _0001.jpg files here so that I can check (on the forum, “Add files” while replying)?
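
If you want to test the bad-jpg theory on your own machine first, here is a rough Python sketch. The function name is made up for illustration. It only checks the two markers that every JPEG file carries (bytes FF D8 at the start, FF D9 at the end), so it can catch a truncated or mislabeled file but not subtler damage, and it says nothing about what the archive's OCR step actually accepts.

```python
from pathlib import Path

JPEG_START = b"\xff\xd8"  # SOI (start of image) marker
JPEG_END = b"\xff\xd9"    # EOI (end of image) marker


def has_jpeg_markers(path):
    """Return True if the file starts and ends with the JPEG markers.

    This is a coarse sanity check (hypothetical helper), not a full decode.
    Some otherwise readable JPEGs carry extra bytes after the end marker
    and would be reported as False here.
    """
    data = Path(path).read_bytes()
    return (
        len(data) >= 4
        and data.startswith(JPEG_START)
        and data.endswith(JPEG_END)
    )
```

For example, has_jpeg_markers("mybook_0000.jpg") returning False would be a hint that the file is worth re-exporting from the scanner software (the file name here is only a placeholder).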

At any rate, you know what to do! Upload your files as _images.zip!

To summarize: name all your files *_0000.jpg, *_0001.jpg, and so on (or *_0000.tif, *_0001.tif, and so on), collect them all in *_images.zip, and upload this file. And if you are replacing this file, delete all the derivatives first.
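
The naming and zipping steps in that summary can be scripted. Below is a minimal Python sketch, not an official tool: the function name, the "mybook" identifier, and the folder names are placeholders I made up. It also assumes the page order is the alphabetical order of the existing file names, and it stores the images at the top level of the zip, since nothing in this thread says whether a folder inside the zip is expected.

```python
import zipfile
from pathlib import Path

# Map the extensions we accept to the spelling used in the thread.
EXTENSIONS = {".jpg": ".jpg", ".jpeg": ".jpg", ".tif": ".tif", ".tiff": ".tif"}


def build_images_zip(src_dir, identifier, out_dir):
    """Write <identifier>_images.zip containing <identifier>_0000.jpg, ...

    Hypothetical helper: pages are numbered in sorted file-name order
    (an assumption), and non-image files in src_dir are skipped.
    """
    pages = sorted(
        p for p in Path(src_dir).iterdir()
        if p.is_file() and p.suffix.lower() in EXTENSIONS
    )
    if not pages:
        raise ValueError(f"no jpg or tif files found in {src_dir}")

    zip_path = Path(out_dir) / f"{identifier}_images.zip"
    # ZIP_STORED: the images are already compressed, so just store them.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for index, page in enumerate(pages):
            ext = EXTENSIONS[page.suffix.lower()]
            zf.write(page, arcname=f"{identifier}_{index:04d}{ext}")
    return zip_path
```

Calling build_images_zip("scans", "mybook", ".") would write mybook_images.zip in the current folder. The original files are left untouched; only the names inside the zip are changed.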

(By the way, if your scanner produces images in tif, you should upload the tifs. Jpg/Jpeg is a “lossy compression” format: conversion from tif to jpg discards some image data and so decreases the image quality. Tif/Tiff and Bmp/bitmap are typically “lossless” formats, which store the image without discarding data.)


Poster: drexelmedarchives Date: Dec 18, 2012 10:25am
Forum: texts Subject: Re: zipped jpg directory won't derive

Well, in working with several 30-day test objects, I did have one _jpg.zip file work, with just a few images. But I had no success after that (until today, with another test file).

Jeff responded to my other post and fixed the problem, with this comment - "the item has been fixed. The _images.zip needed to have Generic Raw Book Zip chosen as it's format in the Edit item page in the Files and Formats section prior to running a derive." So now I know to look for that, too.

Thanks for your details on the process (I did not stick with the error log long enough to pull apart all that was going on). I tried to submit this reply with the first 2 jpg files attached, but it seems neither the post nor the files uploaded, so I'm abandoning the attachments. As you say, I'll stick with *_images.zip for uploads. Presumably this will also be fine if we switch back to uploading tiffs. (We had started with tiffs but had to tar them, and then ran into various problems. I wasn't sure whether the tiffs or the tar file were at fault, so we bumped down to jpgs.)

Whew. It's great when it works, though! Many thanks!!
